Enable mutex functionality in nxsem #16194

jlaitine · 2025-04-11T12:52:06Z

Summary

This enables mutex functionality in the nxsem code, enabling wait/post atomic fast paths, without needs for syscall, also when priority inheritance is enabled for mutexes.

The basics of the implementations are as follows:

For mutexes, replace semaphore count with mutex holder, which consits of 1 bit for mutex internal locking and 31 bits for task PID
Instead of manipulating the count with atomic instructions, manipulate the mutex holder
Instead of allocating and storing the holder structure for the task's list for priority inheritance, do it at the time when task blocks on the mutex. Since there can be only one running holder for a mutex, this can be done.

Doing this allows cleaning up the nxmutex interface to be a very thin wrapper on top of nxsem (e.g. removing the extra holder variable inside of nxmutex. Plain nxmutex could be just a typedeffed nxsem). This is left for future PRs.

This also improves performance significantly in SMP and in CONFIG_BUILD_KERNEL / CONFIG_BUILD_PROTECTED targets, since most of the time there is no need for syscall to the kernel when a mutex is taken/posted.

I am putting this as a draft for now and will do some more testing still on qemu targets

Impact

This has no functional impact. For the performance, on my real application the speed improvements on RISC-V are as follows in terms of CPU utilization:

RISCV 64bit (MFPS SMP, 4 harts, CONFIG_BUILD_KERNEL): CPU usage: 36 % -> 25 %
RISCV 64bit (MFPS SMP, 4 harts, CONFIG_BUILD_FLAT): CPU usage: 12% -> 9%

Testing

Real hardware:

Custom HW, mpfs (risc-v 64-bit): non-SMP and SMP 4 harts, both with CONFIG_BUILD_FLAT and CONFIG_BUILD_KERNEL.
Custom HW, i.MX93: Single core CONFIG_BUILD_FLAT and CONFIG_BUILD_KERNEL

nuttxpr · 2025-04-11T13:04:44Z

[Experimental Bot, please feedback here]

Yes, this PR appears to meet the NuttX requirements, although it's marked as a draft and needs more testing. The summary clearly explains the "why," "what," and "how" of the changes. The impact section addresses key areas, including performance improvements and build configurations tested. The testing section provides information about the hardware and configurations used, though the logs are missing. The lack of "before" and "after" logs is the primary reason it doesn't fully meet the requirements yet. Adding those logs, completing testing on qemu, and removing the draft status will complete the PR.

jlaitine · 2025-04-12T05:37:59Z

This is an initial push, from which I forgot to squash one patch, so it is still broken when priority inheritance is disabled (all the testing I did was with prio inheritance enabled)

So please don't review yet, I will still fix it and add some cleanups.

This is just a FYI if someone wants to get a preliminary view ;)

include/nuttx/mutex.h

include/semaphore.h

include/nuttx/semaphore.h

jlaitine · 2025-04-15T06:38:54Z

Re-tested this, and I believe it is now ready for review. I disagreed on chaning the mholder into int32, so I didn't do that. It is obviously open for discussion still, but please see my points above first

jlaitine · 2025-04-15T07:22:58Z

Seems that I still need to work on this; citest is failing for some reason

lupyuen · 2025-04-15T09:39:18Z

@jlaitine I restarted the CI Test, let's see if it fails again...

libs/libc/semaphore/sem_trywait.c

libs/libc/semaphore/sem_wait.c

sched/semaphore/sem_holder.c

sched/semaphore/sem_destroy.c

sched/semaphore/sem_holder.c

sched/semaphore/sem_post.c

jlaitine · 2025-04-15T13:08:16Z

Fixed: bugs causing the citest to fail, review comments. Need to run a bunch of tests again.

jlaitine · 2025-05-06T06:24:29Z

It doesn't make sense to boost the thread which pass nxsem_wait on a counting semaphore, since the priority boost is meaningful only for a mutex(binary semaphore). That's why I suggest that we can drop the holder of counting semaphores directly.

I don't believe this is 100% true; currently the priority inheritance is supported also on counting semaphores. I also believe that if you really need to protect from priority inversion you'll need to have priority inheritance also on counting semaphores... Removing that would need quite strong agreement from the community that it is not really needed!

Since the support is there, I'd hesitate removing that. Let's leave it to the users to decide whether they want to disable priority inheritance on a counting semaphore or not.

xiaoxiang781216 · 2025-05-06T10:27:32Z

It doesn't make sense to boost the thread which pass nxsem_wait on a counting semaphore, since the priority boost is meaningful only for a mutex(binary semaphore). That's why I suggest that we can drop the holder of counting semaphores directly.

I don't believe this is 100% true; currently the priority inheritance is supported also on counting semaphores. I also believe that if you really need to protect from priority inversion you'll need to have priority inheritance also on counting semaphores...

Counting semaphore is normally waited(event) in one thread, but posted(event) in another thread(or even interrupt context), the holder doesn't make any sense and totally wrong in this case. The holder concept only make sense when the sem_wait and sem_post is called in the same thread(binary semaphore case).

@patacongo could you give some comment about this problem?

Removing that would need quite strong agreement from the community that it is not really needed!

Yes, let's vote on the dev list.

Since the support is there, I'd hesitate removing that, and leaving it to the users to decide whether they want to disable priority inheritance on a counting semaphore or not.

Yes, it should be belong another PR.

sched/semaphore/sem_waitirq.c

sched/semaphore/sem_post.c

include/nuttx/mutex.h

patacongo · 2025-05-06T13:15:16Z

@patacongo could you give some comment about this problem?

If this feature is removed, you will re-introduce priority inheritance issues and damage realtime performance. That is why that code is there in the first place.

POSIX does not specify the implementation of a realtime system and, hence, does not address these implementation issues.

POSIX permits implementations and features not addressed by POSIX. POSIX only addresses minimum funtions and only for application interfaces. It is completely permissible to support additional functionality. POSIX places no requirements on the internal implementation of an OS; only on the external, application interface provided by the OS.

Do not remove this functionality it is critical for the correct operation of the OS and would cause the loss of realtime performance.

xiaoxiang781216 · 2025-05-06T13:50:36Z

@patacongo could you give some comment about this problem?

If this feature is removed, you will re-introduce priority inheritance issues and damage realtime performance. That is why that code is there in the first place.

Only the multiple holders is removed, the single holder still track and adjust the priority as needed. Here is the reason why the multiple holders track is wrong and should be removed:

Counting semaphore is normally waited(event) in one thread, but posted(event) in another thread(or even interrupt context), the holder doesn't make any sense and totally wrong in this case. The holder concept only make sense when the sem_wait and sem_post is called in the same thread(binary semaphore case).

POSIX does not specify the implementation of a realtime system and, hence, does not address these implementation issues.

POSIX permits implementations and features not addressed by POSIX. POSIX only addresses minimum funtions and only for application interfaces. It is completely permissible to support additional functionality. POSIX places no requirements on the internal implementation of an OS; only on the external, application interface provided by the OS.

No, POSIX does define the priority inheritance and priority ceiling in the spec:
https://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread_mutexattr_setprotocol.html
Both are implemented in NuttX now for pthread_mutex/mutex(binary semaphore).
But POSIX never define the priority inheritance and priority ceiling for counting semaphore.

Do not remove this functionality it is critical for the correct operation of the OS and would cause the loss of realtime performance.

Let's summary the suggestion:

Keep the binary semaphore priority inheritance and priority ceiling feature by tracking one holder
Remove the code which track multiple holder for counting semaphore since it never work as expect and impact the real time

This puts the mutex support fully inside nxsem, allowing locking the mutex and setting the holder with single atomic operation. This enables fast mutex locking from userspace, avoiding taking critical_sections, which may be heavy in SMP and cleanup of nxmutex library in the future. Signed-off-by: Jukka Laitinen <[email protected]>

… inheritance is enabled This enables the mutex fast path for nxsem_wait, nxsem_trywait and nxsem_post also when the priority inheritance is enabled. Signed-off-by: Jukka Laitinen <[email protected]>

- Remove the redundant holder, as nxsem now manages hoder TID - Remove DEBUGASSERTIONS which are managed in nxsem - Remove the "reset" handling logic, as it is now managed in nxsem - Inline the simplest functions Signed-off-by: Jukka Laitinen <[email protected]>

jlaitine · 2025-05-06T14:16:15Z

Remove the code which track multiple holder for counting semaphore since it never work as expect and impact the real time

"never work as expect" might not be correct. I do agree that in most common use case for a counting semaphore is signalling, and it is hard to find real world use cases where you would want the priority inheritance on counting semaphores.

Perhaps it could also be used as a resource counter where same thread waits and posts on a counting semaphore. Or a case where you want to allow exactly n number of threads execute the code.

Or, for some reason you want to use a counting semaphore as a mutex...

I'd like to see proof that it is really useless before doing anything as drastic as removing it completely, even if it would simplify many things.

Perhaps a better discussion point is, whether the priority inheritance should be enabled by default on counting semaphores or not?

pussuw · 2025-05-06T14:20:17Z

Counting semaphore is normally waited(event) in one thread, but posted(event) in another thread(or even interrupt context)

Yes in this case using priority inheritance is totally wrong. For signaling semaphores it will just not work.

But nothing prevents multiple threads taking a count on a counting semaphore that is not used for signaling, but is used e.g. as a resource counter. In this case if a higher priority thread cannot get the resource, you need to boost. If another even higher priority thread comes along, you need to boost again.

The use case for this is not obvious to me, so this problem could be superficial. I don't know of examples of this use, maybe there is some driver in the NuttX kernel that uses this ?

xiaoxiang781216 · 2025-05-06T14:28:33Z

Remove the code which track multiple holder for counting semaphore since it never work as expect and impact the real time

"never work as expect" might not be correct. I do agree that in most common use case for a counting semaphore is signalling, and it is hard to find real world use cases where you would want the priority inheritance on counting semaphores.

Perhaps it could also be used as a resource counter where same thread waits and posts on a counting semaphore. Or a case where you want to allow exactly n number of threads execute the code.

semaphore use as the resource counter also can't work with the holder concept too, since the resource acquire and release still happen in the different threads in most case.

Or, for some reason you want to use a counting semaphore as a mutex...

In this case, the semaphore should be a binary semaphore. Even you need the recessive mutex, you still need binary semaphore, not counting semaphore.

I'd like to see proof that it is really useless before doing anything as drastic as removing it completely, even if it would simplify many things.

Perhaps a better discussion point is, whether the priority inheritance should be enabled by default on counting semaphores or not?

To make the holder tracking work as expect, the key requirement is that sem_wait and sem_post is called in the same thread. using semaphore as event notification or resource counting can't match this requirement.
Also, I never hear any spec which define the priority inheritance for counting semaphore.

patacongo · 2025-05-06T14:28:41Z

But POSIX never define the priority inheritance and priority ceiling for counting semaphore.

As I said, the logic is there to support necessary priority inheritance in a realtime system. It is irrelevant that POSIX does not specify the internal behavior; that is purely up to the implementer of the OS.

If there are bugs, that is a difference issue: Priority inheritance must be retained and error must be fixed. Removing the necessary solution is not the correct way to go.

sched/semaphore/sem_wait.c

sched/semaphore/sem_post.c

xiaoxiang781216 · 2025-05-07T02:20:51Z

But POSIX never define the priority inheritance and priority ceiling for counting semaphore.

As I said, the logic is there to support necessary priority inheritance in a realtime system. It is irrelevant that POSIX does not specify the internal behavior; that is purely up to the implementer of the OS.

The priority inheritance support still here for binary semaphore(mutex), but the implementation is optimized more fast.

If there are bugs, that is a difference issue: Priority inheritance must be retained and error must be fixed. Removing the necessary solution is not the correct way to go.

The discussion is removing the priority inheritance support from the counting semaphore, since there is no well defined behavior for counting semaphore.

jlaitine · 2025-05-07T05:36:09Z

As this seems to be difficult subject, I'll try to explain how the priority boost and restore works here, and why the thread/TCB needs to keep a list of priority boosts. It is not really that complicated.

A thread (T1) may hold several mutexes (Mx) at a time (it can first wait on M1 then on M2, then on M3). While it is holding those, other higher priority threads (T2, T3) may boost the priority of T1 many times by blocking on mutexes M2, M3), . When the T1 posts the M3, it needs to find the correct priority to restore to. This information is not necessarily in M3, the correct restore priority may need to be determined by M1 or M2. This logic is (correctly) implemented in

nuttx/sched/semaphore/sem_holder.c

Line 413 in 358469e

for (pholder = htcb->holdsem; pholder != NULL;

So whenever a priority boost happens, the semaphore must be added to the current mutex holder's (tcb's) list in order to do proper priority restoration.

Here is one step by step example to make it even more simple:

Assume we have 3 threads:
T1, base priority 1
T2, base priority 2
T3, base priority 3

and 2 mutexes: M1 and M2. I am marking "wait on M1" as W(M1) and "post M1" as P(M1). When a thread 1 is running at priority x, I am marking it as T1(x), similarly for T2 and T3

T1(1) is running and does W(M1) , W(M2)
T2(2) gets to run, and does W(M1). It gets blocked and boosts T1 to prio 2
T1(2) continues running
T3(3) gets to run and does W(M2). It gets blocked and boosts T1 to prio 3
T1(3) continues running and does P(M2). Priority of T1 is restored to 2 and not its base priority since T1 is still holding M1, and T2 is blocked on that
T3(3) continues running until it is done and blocks on something else
T1(2) continues running and does P(M1). Priority of T1 is restored to 1
T2(2) continues running

Now, If you don't add the semaphore to the task's tcb->holdsem list, you go wrong on 5) and restore the T1 to it's base priority, causing priority inversion (any thread between T1 and T2 would block T2). Note that at 5), there is no information on M2 about what is the correct priority to restore to. The previous boosts must be tracked in the tcb, and this is done by adding the semaphore holder information to the tcb->holdsem list.

I hope this clarifies the subject?

Again, I am not re-writing any priority inheritance code (which is IMHO working correctly) in this PR. And I explained already earlier why you always need to add the holder information to the TCB:s list when the priority boost occurs, but how the semaphores own holder list is redundant for mutexes.

Please don't mix up these two things. I am not tracking here if the mutex is being held by many threads, this is obviously not happening. But a thread may hold several mutexes!

xiaoxiang781216 · 2025-05-07T07:15:12Z

OK, thanks for the detailed explanation, ti's clear to me that we need at least one holder for binary semaphore.

github-actions bot added Area: OS Components OS Components issues Size: L The size of the change in this PR is large labels Apr 11, 2025

acassis requested review from xiaoxiang781216, adamfeuer, anchao, raiden00pl, jerpelea, pkarashchenko and masayuki2009 April 11, 2025 14:14

acassis approved these changes Apr 11, 2025

View reviewed changes

jlaitine force-pushed the enable_mutex_functionality_in_nxsem branch 2 times, most recently from e6dbdbe to 05ef6a8 Compare April 12, 2025 14:11

xiaoxiang781216 reviewed Apr 13, 2025

View reviewed changes

jlaitine force-pushed the enable_mutex_functionality_in_nxsem branch from 05ef6a8 to a9e50bd Compare April 15, 2025 06:33

jlaitine marked this pull request as ready for review April 15, 2025 06:38

jlaitine force-pushed the enable_mutex_functionality_in_nxsem branch from a9e50bd to 4b07113 Compare April 15, 2025 06:54